Evaluating Feature Relevance: Reducing Bias in Relief

Authors

  • José Bins
  • Bruce A. Draper
Abstract

One of the few algorithms that can evaluate features in very large feature sets is Relief [1, 2]. This paper documents a bias in Relief against non-monotonic features, including Gaussian features, and proposes a modification to Relief that removes the bias.
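For orientation, the following is a minimal sketch of the basic two-class Relief weight update in its common textbook form. It is not the modified algorithm this paper proposes, and the abstract does not specify either variant. The function name `relief`, its parameters, the Manhattan distance, and the assumption that every feature is already scaled to [0, 1] are illustrative choices, not details taken from the paper.

```python
import random


def relief(X, y, n_samples=None, seed=0):
    """Return one relevance weight per feature (basic Relief, two classes).

    Assumptions (not from the paper): features are numeric and scaled to
    [0, 1]; neighbours are found with Manhattan distance.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = n_samples or n  # number of sampled instances
    w = [0.0] * d

    def dist(a, b):
        return sum(abs(p - q) for p, q in zip(a, b))

    for _ in range(m):
        i = rng.randrange(n)
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot form both neighbours for this instance
        near_hit = min(hits, key=lambda j: dist(X[i], X[j]))
        near_miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ from the nearest miss and
            # penalise features that differ from the nearest hit.
            w[f] += (abs(X[i][f] - X[near_miss][f])
                     - abs(X[i][f] - X[near_hit][f])) / m
    return w


# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.1], [0.1, 0.9], [0.2, 0.5],
     [0.8, 0.2], [0.9, 0.8], [1.0, 0.4]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
```

On this toy data the weight of feature 0 comes out larger than that of feature 1, which is the ranking behaviour Relief is meant to provide for features that separate the classes monotonically.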


Similar resources

C-LAS Relief-An Improved Feature Selection Technique in Data Mining

Feature selection or Feature subset selection is a process of reducing the attribute space in the feature set. It is also stated that feature selection is a technique of identifying a subset of features. These subsets of features are selected by removing irrelevant or redundant features in the feature set. A good feature set is said to be that it contains highly correlated features with the cla...


International Journal of Research in Computer Applications and Robotics

Feature subset selection is an effective way for reducing dimensionality, removing irrelevant data, increasing learning accuracy and improving results comprehensibility. This process improved by cluster based FAST Algorithm using MST construction. The instances that define a neighbourhood are used as aggregation points to capture feature relevance. Irrelevant feature subspaces within the neighb...


A New Framework for Distributed Multivariate Feature Selection

Feature selection is considered as an important issue in classification domain. Selecting a good feature through maximum relevance criterion to class label and minimum redundancy among features affect improving the classification accuracy. However, most current feature selection algorithms just work with the centralized methods. In this paper, we suggest a distributed version of the mRMR featu...


Combination of Feature Selection and Learning Methods for IoT Data Fusion

In this paper, we propose five data fusion schemes for the Internet of Things (IoT) scenario, which are Relief and Perceptron (Re-P), Relief and Genetic Algorithm Particle Swarm Optimization (Re-GAPSO), Genetic Algorithm and Artificial Neural Network (GA-ANN), Rough and Perceptron (Ro-P) and Rough and GAPSO (Ro-GAPSO). All the schemes consist of four stages, including preprocessing the data set ba...


A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Machine learning-based classification techniques provide support for the decision making process in the field of healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous in nature and their high dimensionality problem comprises in terms of slower learning rate and higher computational cost. Feature selection is expected to deal with the high dimen...



Journal title:

Volume   Issue 

Pages  -

Publication date 2002